Self-Evaluating Compilation Applied to Loop Unrolling
Abstract
Well-engineered compilers use a carefully selected set of optimizations, heuristic optimization policies, and a phase ordering to produce good machine code. Designing a compiler with one heuristic per optimization that works well with other optimization phases is a challenging task. Although compiler designers evaluate the optimization heuristics and phase ordering before deployment, compilers typically neither statically evaluate nor refine the quality of their optimization decisions during a specific compilation. This paper identifies a class of optimizations for which the compiler can evaluate the effectiveness of its heuristics and phase interactions statically, and when necessary re-run optimization phases, using information from the evaluation phase to guide its heuristics. We call this approach self-evaluating compilation (SEC). This model avoids some of the difficulties of predicting phase interactions and perfecting any one heuristic. The SEC model was motivated by loop unrolling and other optimizations for the TRIPS architecture. TRIPS has a limit on instructions that the compiler can place in an atomic execution unit (a TRIPS block), yet each block has a fixed minimum cost. The goal of loop unrolling (and other optimizations) is to produce as full a block as possible without exceeding the block size, since an unnecessary block with a small number of instructions degrades performance. Because unrolling enables downstream optimizations, it needs to occur well before code generation, but this position makes it impossible to predict the final number of instructions. However, eventually the compiler generates code and can, on a per-loop basis, determine if it unrolled too much, too little, or just right. If need be, SEC unrolling then goes back and adjusts the unroll amount accordingly and reruns subsequent optimization phases.
We demonstrate a prototype SEC unrolling implementation that automatically matches the best hand-unrolled version for a set of microbenchmarks on the TRIPS architectural simulator. Although motivated by TRIPS compilation challenges, SEC is broadly applicable to helping solve compilation phase ordering and heuristic design for resource constraints such as register and code size limitations, which can be measured statically and occur when compiling for embedded, VLIW, and partitioned hardware.
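The feedback loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the 128-instruction limit, and the linear cost model in `toy_compile` are assumptions made for illustration only, and `compile_loop` stands in for "run the downstream optimization phases and code generation, then count the instructions in the resulting block".

```python
from typing import Callable


def sec_unroll(compile_loop: Callable[[int], int],
               block_limit: int,
               initial_factor: int,
               max_passes: int = 8) -> int:
    """Pick an unroll factor by compiling, measuring, and re-running.

    compile_loop(factor) is a hypothetical callback that returns the final
    instruction count of the loop body after unrolling by `factor`.
    Returns the largest measured factor that fit within block_limit,
    falling back to 1 (no unrolling) if no measured factor fit.
    """
    factor = max(1, initial_factor)
    best = 1          # assumption: leaving the loop un-unrolled is always legal
    tried = set()

    for _ in range(max_passes):
        if factor in tried:      # fixed point reached: nothing new to learn
            break
        tried.add(factor)

        size = compile_loop(factor)          # evaluate after code generation
        if size <= block_limit:
            best = max(best, factor)         # fits: remember it

        # Use the measured cost per unrolled copy to refine the next guess.
        per_copy = size / factor
        factor = max(1, int(block_limit // per_copy))

    return best


def toy_compile(factor: int) -> int:
    """Hypothetical cost model: 6 instructions of overhead plus 10 per copy."""
    return 6 + 10 * factor


if __name__ == "__main__":
    # Illustrative block limit of 128 instructions.
    print(sec_unroll(toy_compile, block_limit=128, initial_factor=4))
    print(sec_unroll(toy_compile, block_limit=128, initial_factor=20))
```

In this sketch the early unrolling heuristic only supplies `initial_factor`; the measurement taken after code generation corrects it in either direction, so an under-unrolled and an over-unrolled starting guess converge on the same factor under the toy cost model.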
Similar resources
A Simulation Methodology for Software Energy Evaluation
We describe a comprehensive simulation methodology and tool for evaluation of software energy for the pipelined DLX processor. Energy models for each module of DLX are built and the energy is evaluated during run time execution. The inputs to the simulator are the instructions of the program, and the simulator estimates the energy of each micro-instruction using the energy models. Our simulator allow...
Convergent Compilation Applied to Loop Unrolling
Well-engineered compilers use a carefully selected set of optimizations, heuristic optimization policies, and a phase ordering. Designing a single optimization heuristic that works well with other optimization phases is a challenging task. Although compiler designers evaluate heuristics and phase orderings before deployment, compilers typically do not statically evaluate nor refine the quality ...
Impact of Loop Unrolling on Area, Throughput and Clock Frequency in ROCCC: C to VHDL Compiler for FPGAs
Loop unrolling is the main compiler technique that allows reconfigurable architectures to achieve large degrees of parallelism. However, loop unrolling increases the area and can potentially have a negative impact on clock cycle time. In most embedded applications, the critical parameter is the throughput. Loop unrolling can therefore have contradictory effects on the throughput. As a consequence ...
Modeling Loop Unrolling: Approaches and Open Issues
Loop unrolling plays an important role in compilation for Reconfigurable Processing Units (RPUs) as it exposes operator parallelism and enables other transformations (e.g., scalar replacement). Deciding when and where to apply loop unrolling, either fully or partially, leads to large design space exploration problems. In order to cope with these vast spaces, researchers have explored the applic...
Loop Transformations in the Ahead-of-Time Optimization of Java Bytecode
Loop optimizations such as loop unrolling, unfolding and invariant code motion have long been used in a wide variety of compilers to improve the running time of applications. In this paper we present a series of experimental results detailing the effect these techniques have on the running time of Java applications following ahead of time optimization. We also detail the optimization tools and ...